AITopics | Naïve Bayes classifier


Automated Bug Report Prioritization in Large Open-Source Projects

Pierson, Riley, Moin, Armin

arXiv.org Artificial Intelligence Apr-23-2025

Large open-source projects receive a large number of issues (known as bugs), including software defect (i.e., bug) reports and new feature requests, from their user and developer communities at a fast rate. The often limited project resources do not allow them to deal with all issues. Instead, they have to prioritize them according to the project's priorities and the issues' severities. In this paper, we propose a novel approach to automated bug prioritization based on the natural language text of the bug reports that are stored in the open bug repositories of the issue-tracking systems. We conduct topic modeling using a variant of LDA called TopicMiner-MTM and text classification with the BERT large language model to achieve a higher performance level compared to the state-of-the-art. Experimental results using an existing reference dataset containing 85,156 bug reports of the Eclipse Platform project indicate that we outperform existing approaches in terms of Accuracy, Precision, Recall, and F1-measure of the bug report priority prediction.

Index Terms: automated bug prioritization, automated bug triage, mining software repositories, machine learning, natural language processing

I. INTRODUCTION

Large open-source projects offer an issue-tracking system with an open bug repository, where developers and users can report the software defects they find or any new feature requests they may have. These reports are called bug reports. However, the projects' resources are limited, while processing and resolving the bug reports is typically very costly. Hence, not all bug reports in the open bug repository can be processed and handled at once.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.15912

Country: North America > United States > Colorado (0.14)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)


Fractional Naive Bayes (FNB): non-convex optimization for a parsimonious weighted selective naive Bayes classifier

Hue, Carine, Boullé, Marc

arXiv.org Machine Learning Sep-17-2024

We study supervised classification for datasets with a very large number of input variables. The naïve Bayes classifier is attractive for its simplicity, scalability and effectiveness in many real data applications. When the strong naïve Bayes assumption of conditional independence of the input variables given the target variable is not valid, variable selection and model averaging are two common ways to improve the performance. In the case of the naïve Bayes classifier, the resulting weighting scheme on the models reduces to a weighting scheme on the variables. Here we focus on direct estimation of variable weights in such a weighted naïve Bayes classifier. We propose a sparse regularization of the model log-likelihood, which takes into account prior penalization costs related to each input variable. Compared to the averaging-based classifiers used up until now, our main goal is to obtain parsimonious robust models with fewer variables and equivalent performance. The direct estimation of the variable weights amounts to a non-convex optimization problem for which we propose and compare several two-stage algorithms. First, the criterion obtained by convex relaxation is minimized using several variants of standard gradient methods. Then, the initial non-convex optimization problem is solved using local optimization methods initialized with the result of the first stage. The various proposed algorithms result in optimization-based weighted naïve Bayes classifiers that are evaluated on benchmark datasets and positioned with respect to a reference averaging-based classifier.
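As a rough illustration of what a weight on each variable means, the sketch below scores classes with a weighted naïve Bayes rule: each variable's log-likelihood term is multiplied by its weight, so a weight of 0 removes the variable and a weight of 1 recovers plain naïve Bayes. The function name `weighted_nb_posterior`, the data layout, and the toy probabilities are assumptions made for this example. It does not implement the paper's sparse regularization or its two-stage optimization.

```python
import math


def weighted_nb_posterior(log_prior, log_cond, weights, x):
    """Class posterior under a weighted naive Bayes model (illustrative sketch).

    log_prior[c]      : log P(Y = c)
    log_cond[k][c][v] : log P(X_k = v | Y = c)
    weights[k]        : weight of variable k (0 drops it, 1 is plain naive Bayes)
    x[k]              : observed value of variable k
    """
    scores = {}
    for c, lp in log_prior.items():
        scores[c] = lp + sum(
            w * log_cond[k][c][x[k]] for k, w in enumerate(weights)
        )
    # Normalise with log-sum-exp for numerical stability.
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {c: math.exp(s - log_z) for c, s in scores.items()}


# Hypothetical toy model: two classes and one binary variable that appears
# twice, so the conditional independence assumption is violated.
cond = {"a": {1: 0.8, 0: 0.2}, "b": {1: 0.4, 0: 0.6}}
one_var = {c: {v: math.log(p) for v, p in d.items()} for c, d in cond.items()}
log_cond = [one_var, one_var]  # the second variable duplicates the first
log_prior = {"a": math.log(0.5), "b": math.log(0.5)}
x = (1, 1)

p_full = weighted_nb_posterior(log_prior, log_cond, [1.0, 1.0], x)
p_sparse = weighted_nb_posterior(log_prior, log_cond, [1.0, 0.0], x)
print(p_full["a"], p_sparse["a"])
```

With the duplicated variable, the unweighted model counts the same evidence twice and reports 0.8 for class `a`. Setting the redundant weight to zero gives 2/3, which is the posterior from the single informative variable.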

algorithm, classifier, dataset, (17 more...)

arXiv.org Machine Learning

2409.111

Country:

North America > United States > California (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
Europe > France (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


Naive Bayes Classifiers and One-hot Encoding of Categorical Variables

Williams, Christopher K. I.

arXiv.org Machine Learning Apr-28-2024

This paper investigates the consequences of encoding a $K$-valued categorical variable incorrectly as $K$ bits via one-hot encoding, when using a Naïve Bayes classifier. This gives rise to a product-of-Bernoullis (PoB) assumption, rather than the correct categorical Naïve Bayes classifier. The differences between the two classifiers are analysed mathematically and experimentally. In our experiments using probability vectors drawn from a Dirichlet distribution, the two classifiers are found to agree on the maximum a posteriori class label for most cases, although the posterior probabilities are usually greater for the PoB case.
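The difference between the two likelihoods can be sketched for a single $K$-valued variable. Under the categorical model the likelihood of observing value $j$ is just the class's probability for $j$; under the product-of-Bernoullis reading of the one-hot bits, it is additionally multiplied by the probability that every other bit is off. The probability vectors below are made-up values chosen for illustration (they are not drawn from a Dirichlet distribution as in the paper), and the function names are hypothetical.

```python
def posterior(prior, likelihood):
    """Normalise prior * likelihood over classes."""
    joint = {c: prior[c] * likelihood[c] for c in prior}
    z = sum(joint.values())
    return {c: p / z for c, p in joint.items()}


def categorical_likelihood(theta, j):
    """P(X = j | class) for a K-valued categorical variable."""
    return theta[j]


def pob_likelihood(theta, j):
    """Likelihood of the one-hot vector for value j when each of the K bits
    is treated as an independent Bernoulli with P(bit_i = 1) = theta[i]."""
    lik = 1.0
    for i, t in enumerate(theta):
        lik *= t if i == j else (1.0 - t)
    return lik


# Hypothetical class-conditional probability vectors for K = 3.
theta = {"a": [0.7, 0.2, 0.1], "b": [0.2, 0.3, 0.5]}
prior = {"a": 0.5, "b": 0.5}
observed = 0  # index of the observed category

post_cat = posterior(
    prior, {c: categorical_likelihood(t, observed) for c, t in theta.items()}
)
post_pob = posterior(
    prior, {c: pob_likelihood(t, observed) for c, t in theta.items()}
)
print(post_cat["a"], post_pob["a"])
```

In this toy case both models pick class `a`, but the PoB posterior (about 0.88) is larger than the categorical one (about 0.78), which matches the direction of the effect reported in the abstract.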

classifier, posterior probability, probability, (14 more...)

arXiv.org Machine Learning

2404.1819

Country: North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


The Impact of Twitter Sentiments on Stock Market Trends

Mokhtari, Melvin, Seraj, Ali, Saeedi, Niloufar, Karshenas, Adel

arXiv.org Artificial Intelligence Feb-14-2023

The Web is a vast virtual space where people can share their opinions, impacting all aspects of life and having implications for marketing and communication. The most up-to-date and comprehensive information can be found on social media because of how widespread and straightforward it is to post a message. Accordingly, social media platforms are regarded as a valuable resource for making precise market predictions. In particular, Twitter has developed into a potent tool for understanding user sentiment. This article examines how well tweets can influence stock symbol trends. We analyze the volume, sentiment, and mentions of the top five stock symbols in the S&P 500 index on Twitter over three months. Long Short-Term Memory, Bernoulli Naïve Bayes, and Random Forest were the three algorithms implemented in this process. Our study revealed a significant correlation between stock prices and Twitter sentiment.

machine learning, natural language, prediction, (17 more...)

arXiv.org Artificial Intelligence

2302.07244

Country:

Europe > United Kingdom > England (0.04)
Europe > Germany (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Belgium (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)


The Naive Bayes classifier: How it works

#artificialintelligence Jan-25-2022, 08:31:07 GMT

Classification algorithms try to predict the class or the label of the categorical target variable. A categorical variable typically represents qualitative data that has discrete values, such as pass/fail or low/medium/high. Among the many classification algorithms, the Naïve Bayes classifier is one of the simplest. It is often used with large text datasets, among other applications. The aim of this article is to explain how the Naïve Bayes algorithm works.
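A minimal sketch of the idea, assuming categorical features and add-one (Laplace) smoothing: training only counts how often each feature value occurs within each class, and prediction picks the class with the highest sum of log prior and log conditional probabilities. The function names, the smoothing choice, and the tiny pass/fail dataset are all invented for this example and are not taken from the article.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels, alpha=1.0):
    """Collect the counts a categorical naive Bayes model needs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)            # feature index -> set of seen values
    for row, c in zip(rows, labels):
        for k, v in enumerate(row):
            value_counts[(k, c)][v] += 1
            values[k].add(v)
    return class_counts, value_counts, values, alpha


def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)  # log prior
        for k, v in enumerate(row):
            # Smoothed estimate of P(feature k = v | class c).
            num = value_counts[(k, c)][v] + alpha
            den = n_c + alpha * len(values[k])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical data: (attendance, homework score) -> exam outcome.
rows = [
    ("high", "high"), ("high", "medium"), ("low", "low"),
    ("low", "medium"), ("high", "low"), ("low", "high"),
]
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]

model = train(rows, labels)
print(predict(model, ("high", "medium")))
print(predict(model, ("low", "low")))
```

Working in log space avoids numerical underflow when many features are multiplied together, which matters for the large text datasets mentioned above.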

class conditional probability, conditional probability, probability, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


Estimating IRI based on pavement distress type, density, and severity: Insights from machine learning techniques

Qiao, Yu, Chen, Sikai, Alinizzi, Majed, Alamaniotis, Miltos, Labi, Samuel

arXiv.org Machine Learning Oct-11-2021

Surface roughness is a primary measure of pavement performance that has been associated with ride quality and vehicle operating costs. Of all the surface roughness indicators, the International Roughness Index (IRI) is the most widely used. However, it is costly to measure IRI, and for this reason, certain road classes are excluded from IRI measurements at a network level. Higher levels of distresses are generally associated with higher roughness. However, for a given roughness level, pavement data typically exhibits a great deal of variability in the distress types, density, and severity. It is hypothesized that it is feasible to estimate the IRI of a pavement section given its distress types and their respective densities and severities. To investigate this hypothesis, this paper uses data from in-service pavements and machine learning methods to ascertain the extent to which IRI can be predicted given a set of pavement attributes. The results suggest that machine learning can be used reliably to estimate IRI based on the measured distress types and their respective densities and severities. The analysis also showed that IRI estimated this way depends on the pavement type and functional class. The paper also includes an exploratory section that addresses the reverse situation, that is, estimating the probability of pavement distress type distribution and occurrence severity/extent based on a given roughness level.

distress type, pavement, roughness, (13 more...)

arXiv.org Machine Learning

2110.05413

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas > Bexar County > San Antonio (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
(13 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(2 more...)


NAÏVE Bayes Classifier

#artificialintelligence Aug-19-2021, 18:20:09 GMT

Let us talk about Bayesian networks. A Bayesian network is a probabilistic model that represents a set of random variables and their conditional dependencies. The model can be represented as a DAG (Directed Acyclic Graph) whose nodes can be observable quantities, latent variables (not observable, only inferred), or unknown parameters or hypotheses. The DAG makes the model easier to understand. Edges in the DAG represent conditional dependencies between nodes.
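The dependency structure described above corresponds to a factorization of the joint distribution. As a standard result (stated here for context, not quoted from the article), a Bayesian network over variables $X_1, \dots, X_n$ factorizes as in the first line below, and the naïve Bayes classifier is the special case in which the class $C$ is the only parent of every feature. The symbols $X_i$, $C$, and $\mathrm{pa}(X_i)$ (the parents of $X_i$ in the DAG) are notation chosen for this sketch.

```latex
\begin{align}
P(X_1, \dots, X_n) &= \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr) \\
P(C, X_1, \dots, X_n) &= P(C) \prod_{i=1}^{n} P(X_i \mid C)
\end{align}
```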

bayes classifier, bayes theorem, probability, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification

Michelucci, Umberto, Sperti, Michela, Piga, Dario, Venturini, Francesca, Deriu, Marco A.

arXiv.org Artificial Intelligence Jul-24-2021

This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features regardless of the model used. This limit, namely the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper the algorithm is described in detail, its entire mathematical framework is presented and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
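To give a feel for why such a limit exists with categorical features, the sketch below computes the highest accuracy any classifier could reach on a given labelled sample: records with identical feature values cannot be told apart, so the best possible rule predicts the majority class within each group. This is only the accuracy side of the idea, evaluated on the sample itself; it is not the ILD algorithm from the paper and does not cover AUC. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def best_achievable_accuracy(rows, labels):
    """Upper bound on accuracy for this sample with categorical features.

    Rows sharing the same feature tuple are indistinguishable, so at most
    the majority-class count of each group can be classified correctly.
    """
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        groups[tuple(row)][y] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(labels)


# Hypothetical binary-classification sample with one categorical feature.
rows = [("a",), ("a",), ("a",), ("b",), ("b",)]
labels = [1, 1, 0, 0, 0]

limit = best_achievable_accuracy(rows, labels)
print(limit)  # 0.8
```

Here one of the three `a` records carries the minority label, so no model using only this feature can exceed 4 correct predictions out of 5 on this sample.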

algorithm, dataset, ild algorithm, (12 more...)

arXiv.org Artificial Intelligence

2107.11609

Country:

North America > United States > Massachusetts > Middlesex County > Framingham (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.72)


Machine learning made easy with Python

#artificialintelligence Jan-31-2021, 23:00:13 GMT

Naïve Bayes is a classification technique that serves as the basis for implementing several classifier modeling algorithms. Naïve Bayes-based classifiers are considered some of the simplest, fastest, and easiest-to-use machine learning techniques, yet are still effective for real-world applications. Naïve Bayes is based on Bayes' theorem, formulated by 18th-century statistician Thomas Bayes. This theorem assesses the probability that an event will occur based on conditions related to the event. For example, an individual with Parkinson's disease typically has voice variations; hence such symptoms are considered related to the prediction of a Parkinson's diagnosis.
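Bayes' theorem can be worked through numerically for a single symptom. The sketch below computes the probability of a condition given an observed symptom from three inputs: the prior probability of the condition and the probability of the symptom with and without it. All three numbers are invented for illustration and are not clinical figures; the function name is likewise hypothetical.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) via Bayes' theorem for a binary hypothesis H."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Illustrative, made-up inputs (not real medical statistics):
prior = 0.01                 # P(condition)
p_symptom_given_cond = 0.9   # P(voice variation | condition)
p_symptom_given_none = 0.1   # P(voice variation | no condition)

post = bayes_posterior(prior, p_symptom_given_cond, p_symptom_given_none)
print(round(post, 4))  # 0.0833
```

Even with a symptom that is nine times more likely under the condition, the posterior stays below 10% because the prior is small. A naïve Bayes classifier repeats this update for every feature, treating the features as independent given the class.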

bayes, probability, naïve bayes, (15 more...)

#artificialintelligence

Industry: Health & Medicine (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


How machine learning removes spam from your inbox

#artificialintelligence Dec-1-2020, 00:55:06 GMT

This article is part of "Deconstructing artificial intelligence," a series of posts that explore the details of how AI applications work. Of more than 300 billion emails sent every day, at least half are spam. Email providers have the huge task of filtering out the spam and making sure their users receive the messages that matter. The line between spam and non-spam messages is fuzzy, and the criteria change over time. Of the various efforts to automate spam detection, machine learning has so far proven to be the most effective and is the approach favored by email providers.

algorithm, email, spam, (16 more...)

#artificialintelligence

Country: North America > United States > California > Orange County > Irvine (0.05)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.77)
